Perceptive evaluation of Levenshtein dialect distance measurements using Norwegian dialect data

Abstract

The Levenshtein dialect distance method has proven successful for measuring phonetic distances between Dutch dialects. The aim of the present investigation is to validate the Levenshtein dialect distance with perceptual data from a language area other than the Dutch one, namely Norway. We calculate the correlation between the Levenshtein distances and the distances between 15 Norwegian dialects as judged by Norwegian listeners. We carry out this analysis to see the degree to which the average Levenshtein distances correspond to the psychoacoustic perception of the speakers of the dialects.

In 1995, Kessler introduced the Levenshtein distance as a tool for measuring linguistic distances between language varieties, applying the algorithm to the comparison of Irish dialects. The Levenshtein distance is a string edit distance measure: it is the cost of the least costly set of insertions, deletions, and substitutions needed to transform one string into another. On the basis of linguistic distances between dialectal varieties, dialect areas can be found. More innovative is the possibility of drawing dialect maps that reflect the fact that dialect areas should be considered as continua and not as areas separated by sharp borders. The application of the method to the Dutch language area has produced convincing results (see Heeringa, 2004; Nerbonne & Heeringa, 1998). The results are partly similar to the map of Daan and Blok (1969), which may be considered the most authoritative Dutch dialect map to date.

Still, it is desirable to validate the method further. In this article we investigate to what extent dialect distances found with the Levenshtein distance correlate with distances as perceived by the dialect speakers themselves. We try to answer the following question: Can Levenshtein-based dialect distances be considered a good approximation of perceptual distances? To answer this question, we use a set of 15 Norwegian varieties.
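The validation idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes unit costs for every edit operation, normalises each word distance by the length of the longer word, and averages over word pairs, all of which are simplifying assumptions made here for illustration. The function names `levenshtein`, `dialect_distance`, and `pearson` are hypothetical.

```python
from math import sqrt


def levenshtein(a, b):
    """Unit-cost edit distance between two sequences of segments."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # deleting the first i segments of a
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def dialect_distance(words_a, words_b):
    """Mean length-normalised distance over paired transcriptions.

    Normalising by the longer word is an assumption of this sketch.
    """
    pairs = list(zip(words_a, words_b))
    total = sum(levenshtein(a, b) / max(len(a), len(b), 1) for a, b in pairs)
    return total / len(pairs)


def pearson(xs, ys):
    """Pearson correlation of two equally long lists of numbers.

    Raises ZeroDivisionError if either list has zero variance.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

In use, one would compute `dialect_distance` for every pair of varieties, collect the corresponding listener judgements for the same pairs, and pass the two lists to `pearson`. A high correlation would suggest that the computed distances approximate the perceived ones.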
Results for Dutch may be impressive, but the Dutch dialect area is a flat, regularly populated landscape. In contrast with this, the Norwegian dialect area is less

The present article reports on part of a study supported by NWO, the Netherlands Organization for Scientific Research. We are grateful for the permission from Kristian Skarbø and Jørn Almberg to use their material and for the help of Jørn Almberg during the whole investigation. We thank Saakje van Dellen for her obliging help with the data entry and Peter Kleiweg for letting us use the programs that he developed for the visualization of the maps and dendrograms in this article. Finally, we would like to thank John Nerbonne for valuable comments and for correcting our English.

Language Variation and Change, 16 (2004), 189–207. Printed in the U.S.A. © 2004 Cambridge University Press 0954-3945/04 $9.50 DOI: 10.1017/S0954394504163023

Similar resources

Measuring Norwegian dialect distances using acoustic features

Computational dialectometry has been proven to be useful for finding dialect relationships and identifying dialect areas. The first to develop a method of measuring dialect distances was Jean Séguy, assisted and inspired by Henri Guiter (Chambers and Trudgill, 1998). Strongly related to the methodology of Séguy is the work of Goebl, although the basis of Goebl’s work was developed mainly in dep...

Norwegian Dialects Examined Perceptually and Acoustically WILBERT HEERINGA

Gooskens (2003) described an experiment which determined linguistic distances between 15 Norwegian dialects as perceived by Norwegian listeners. The results are compared to Levenshtein distances, calculated on the basis of transcriptions (of the words) of the same recordings as used in the perception experiment. The Levenshtein distance is equal to the sum of the weights of the insertions, dele...

Dialect Pronunciation Comparison and Spoken Word Recognition

Two adaptations of the regular Levenshtein distance algorithm are proposed based on psycholinguistic work on spoken word recognition. The first adaptation is inspired by the Cohort model which assumes that the word-initial part is more important for word recognition than the word-final part. The second adaptation is based on the notion that stressed syllables contain more information and are mo...

Adaptive String Distance Measures for Bilingual Dialect Lexicon Induction

This paper compares different measures of graphemic similarity applied to the task of bilingual lexicon induction between a Swiss German dialect and Standard German. The measures have been adapted to this particular language pair by training stochastic transducers with the Expectation-Maximisation algorithm or by using handmade transduction rules. These adaptive metrics show up to 11% F-measure ...

Publication date: 2004